Background of the Invention

Field of the Invention. The invention is generally related to digital video
transmission systems and is specifically directed to a method and apparatus for 
displaying, mapping and controlling video streams distributed over a network for 
supporting the transmission of live, near real-time video data in a manner to maximize 
display options through remote control from a monitoring station. 

Discussion of the Prior Art. Prior art video security systems typically use a
plurality of analog cameras, which generate composite video signals, often in
monochrome. The analog video signals are delivered to a centralized monitoring station 
and displayed on a suitable monitor. 

Such systems often involve more than one video camera to monitor the premises. 
It is thus necessary to provide a means to display these multiple video signals. Three 
methods are in common use: 

• Some installations simply use several video monitors at the monitoring 
station, one for each camera in the system. This places a practical limit on 
the number of cameras that the system can have. 

• A time-sequential video switcher may be used to route multiple cameras to 
one monitor, one at a time. Such systems typically 'dwell' on each 
camera for several seconds before switching to the next camera. This
method obviously leaves each camera unseen for the majority of the time. 

• Newer systems accept several simultaneous video input signals and 
display them all simultaneously on a single display monitor. The 
individual video signals are arranged in a square grid, with 1, 4, 9, or 16 
cameras simultaneously shown on the display. 



[0004] A typical prior art system is the Multivision Pro MV-96p, manufactured by
Sensormatic Video Products Division. This device accepts sixteen analog video inputs,
and uses a single display monitor to display one, four, nine, or sixteen of the incoming 
video signals. The device digitizes all incoming video signals, and decimates them as 
necessary to place more than one video on the display screen. The device is capable of 
detecting motion in defined areas of each camera's field of view. When motion is 
detected, the device may, by prior user configuration, turn on a VCR to record specific 
video inputs, and may generate an alarm to notify security personnel. 

[0005] While typical of prior art systems, the device is not without deficiencies. First,
video may be displayed only on a local, attached monitor and is not available to a wider
audience via a network. Second, individual videos are recorded at a lower frame rate
than the usual 30 frames/second. Third, video is recorded on an ordinary VHS-format
cassette tape, which makes searching for a random captured event tedious and
time-consuming. Finally, the system lacks the familiar and commonplace User Interface
typically available on a computer-based product.

[0006] With the availability of cameras employing digital encoders that produce
industry-standard digital video streams such as, by way of example, MPEG-1 streams, it
is possible to transmit a plurality of digitized video streams. It would therefore be
desirable to display any combination of the streams on one or more video screens. The
use of MPEG-1 streams is advantageous due to the low cost of the encoder hardware, and
to the ubiquity of software MPEG-1 players. However, difficulties arise from the fact
that the MPEG-1 format was designed primarily to support playback of recorded video
from a video CD, rather than to support streaming of 'live' sources such as surveillance
cameras and the like. MPEG system streams contain multiplexed elementary bit streams
containing compressed video and audio. Since the retrieval of video and audio data from
the storage medium (or network) tends to be temporally discontinuous, it is necessary to
embed certain timing information in the respective video and audio elementary streams.
In the MPEG-1 standard, this information consists of Presentation Timestamps (PTS) and,
optionally, Decoding Timestamps (DTS).

[0007] On desktop computers, it is common practice to play MPEG-1 video and audio
using a commercially available software package, such as, by way of example, the
Microsoft Windows Media Player. This software program may be run as a standalone
application. Otherwise, components of the player may be embedded within other
software applications.

[0008] Media Player, like MPEG-1 itself, is inherently file-oriented and does not support
playback of continuous sources such as cameras via a network. Before Media Player
begins to play back a received video file, it must first be informed of certain parameters 
including file name and file length. This is incompatible with the concept of a continuous 
streaming source, which may not have a filename and which has no definable file length. 

[0009] Moreover, the time stamping mechanism used by Media Player is fundamentally
incompatible with the time stamping scheme defined by the MPEG-1 standard.
MPEG-1 calls out a time stamping mechanism which is based on a continuously
incrementing 90 kHz clock located within the encoder. Further, the MPEG-1 standard
assumes no Beginning-of-File marker, since it is intended to produce a continuous
stream.

[0010] Media Player, on the other hand, accomplishes time stamping by counting units of
100 nanoseconds since the beginning of the current file.
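
The mismatch between the two time bases can be illustrated with a short sketch. It assumes the standard MPEG-1 system clock of 90 kHz with 33-bit timestamps, and it rebases each PTS against the first one received so that the result resembles a "time since the beginning of the file" in 100 ns units. The names makePtsRebaser and toHundredNs are hypothetical; this illustrates the arithmetic only and is not the filter used by the invention.

```javascript
// Hypothetical helper: convert MPEG-1 PTS values (90 kHz ticks, 33-bit counter)
// into 100 ns units measured from the first timestamp seen.
const PTS_CLOCK_HZ = 90000;
const PTS_MODULUS = Math.pow(2, 33);   // the PTS counter wraps after 2^33 ticks
const UNITS_PER_SECOND = 10000000;     // number of 100 ns units in one second

function makePtsRebaser() {
  let firstPts = null;
  return function toHundredNs(pts) {
    if (firstPts === null) firstPts = pts;
    // Modular difference tolerates one wrap of the 33-bit counter.
    const elapsedTicks = (pts - firstPts + PTS_MODULUS) % PTS_MODULUS;
    return Math.round(elapsedTicks * UNITS_PER_SECOND / PTS_CLOCK_HZ);
  };
}
```

Because the stream has no beginning-of-file marker, the first timestamp a client happens to receive is treated as time zero; a client joining later simply obtains a different zero point.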

Summary of the Invention 

[00011] The subject invention is directed to an IP-network-based surveillance and
monitoring system wherein video captured from a number of remotely located security
cameras may be digitized, compressed, and networked for access, review and control at a
remote monitoring station. The preferred embodiment incorporates a streaming video
system for capturing, encoding and transmitting continuous video from a camera to a
display monitor via a network. The system includes an encoder for receiving a video signal
from the camera, the encoder producing a high-resolution output signal and a low-resolution
output signal representing the video signal; a router or switch for receiving both the
high-resolution output signal and the low-resolution output signal; and a display monitor in
communication with the router for selectively displaying either the high-resolution output
signal or the low-resolution output signal. It will be understood by those skilled in the art
that the term "router and/or switch" as used herein is intended as a generic term for any
device that receives and reroutes a plurality of signals. Hubs, switched hubs and intelligent
routers are all included in the term "router and/or switch" as used herein.
[00012] In the preferred embodiment the camera videos are digitized and encoded in three
separate formats: motion MPEG-1 at 352x240 resolution, motion MPEG-1 at 176x112
resolution, and JPEG at 720x480 resolution. Each remote monitoring station is PC-based
with a plurality of monitors, one of which is designated a primary monitor. The primary
monitor provides the user interface function screen and the other, secondary monitors are
adapted for displaying full screen, split screen and multiple screen displays of the various
cameras. Each video stream thus displayed requires the processor to run an instance of
the video player, such as, by way of example, Microsoft Media Player. A single Pentium
III 500 MHz processor can support a maximum of 16 such instances, provided that the
input video is constrained to QSIF resolution and a bitrate of 128 kb/s.

[00013] The novel user interface functions of the system interact with the system through
the browser. Initially, a splash screen appears, containing the login dialog. A check box is
provided to enable an automatic load of the user's last application settings. After logon, 
the server loads a series of HTML pages which, with the associated scripts and applets, 
provide the entire user interface. Users equipped with a single-monitor system interact 
with the system entirely through the primary screen. Users may have multiple secondary 
screens, which are controlled by the primary screen. In the preferred embodiment the 
primary screen is divided into three windows: the map window; the video window and 
the control window. 

[00014] The primary screen map window contains a map of the facility and typically is a
user-supplied series of one or more bitmaps. Each map contains icons representing
cameras or other sensor sites. Each camera/sensor icon represents the position of the
camera within the facility. Each site icon represents another facility or function site
within the facility. In addition, camera icons are styled so as to indicate the direction the
camera is pointed. When a mouse pointer dwells over a camera icon for a brief,
predefined interval, a "bubble" appears identifying the camera. Each camera has an
associated camera ID and camera name. Both of these are unique alphanumeric names of
20 characters or less and are maintained in a table managed by the server. The camera ID
is used internally by the system to identify the camera and is not normally seen by the
user. The camera name is a user-friendly name, assigned by the user and easily
changeable from the user screen. Any user with administrator privileges may change the
camera name.

In the preferred embodiment, the map window is a pre-defined size, typically 510
pixels by 510 pixels. The bitmap may be scaled to fit, with the camera icons repositioned
accordingly.
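
The scaling step can be sketched as follows. The sketch assumes that icon positions are stored in the pixel coordinates of the original bitmap, that one scale factor is used for both axes, and that the scaled bitmap is centered in the window; none of these details is stated in the specification, and the function name fitMap is hypothetical.

```javascript
// Hypothetical sketch: fit a facility bitmap into the fixed square map window
// and move the camera icons by the same transform.
const MAP_WINDOW_SIZE = 510; // pixels, per the preferred embodiment

function fitMap(bitmapWidth, bitmapHeight, icons) {
  // A single scale factor preserves the aspect ratio (an assumption here).
  const scale = Math.min(MAP_WINDOW_SIZE / bitmapWidth,
                         MAP_WINDOW_SIZE / bitmapHeight);
  // Center the scaled bitmap inside the window (also an assumption).
  const offsetX = (MAP_WINDOW_SIZE - bitmapWidth * scale) / 2;
  const offsetY = (MAP_WINDOW_SIZE - bitmapHeight * scale) / 2;
  return icons.map(function (icon) {
    return {
      name: icon.name,
      x: Math.round(icon.x * scale + offsetX),
      y: Math.round(icon.y * scale + offsetY)
    };
  });
}
```

For a 1020 x 510 bitmap the scale factor is 0.5, so an icon stored at (1020, 510) is drawn at (510, 383) in the map window.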

When the mouse pointer dwells over a camera icon for a brief time, a bubble 
appears which contains the camera name. If the icon is double left clicked, then that 
camera's video appears on the primary screen video window in a full screen view. If the 
icon is right clicked, a menu box appears with further options such as: zone set up; 
camera set up; and event set up. 

When the mouse pointer dwells on a site or sensor icon for a brief time a bubble 
appears with the site or sensor name. When the icon is double left clicked, the linked site 
is loaded into the primary screen with the previous site retained as a pull down. Finally, 
the user may drag and drop a camera icon into any unused pane in the primary screen 
video window. The drag and drop operation causes the selected camera video to appear 
in the selected pane. The position of the map icon is not affected by the drag and drop 
operation. 

In the preferred embodiment two pull down lists are located beneath the map 
pane. A "site" list contains presets and also keeps track of all of the site maps visited 
during the current session and can act as a navigation list. A "map" list allows the user to 
choose from a list of maps associated with the site selected in the site list. 

The control window is divided into multiple sections, including at least the 
following: a control section including logon, site, presets buttons and a real-time clock 
display; a control screen section for reviewing the image database in either a browse or 
preset mode; and a live view mode. In the live and browse modes events can be 
monitored and identified by various sensors, zones may be browsed, specific cameras 
may be selected and various other features may be monitored and controlled. 

The primary screen video window is used to display selected cameras from the
point-click-and-drag feature, the preset system, or the browse feature. This screen and its
functions also control the secondary monitor screens. The window is selectively a full
window, split window or multiple pane window and likewise can display one, two or
multiple cameras simultaneously. The user-friendly camera name is displayed along with
the camera video. The system is set up so that left clicking on the pane will "freeze-frame"
the video in a particular pane. Right clicking on the pane will initiate various
functions. Each video pane includes a drag and drop feature permitting the video in a
pane to be moved to any other pane, as desired.
[00021] In those monitoring stations having multiple displays, the primary display screen
described above is also used to control the secondary screens. The secondary screens are
generally used for viewing selected cameras and are configured by code executing on the 
primary screen. The video pane(s) occupy the entire active video area of the secondary 
screens. 

[00022] The system supports a plurality of cameras and an encoder associated with each
of the cameras, the high-resolution output signal and low-resolution output signal unique
to each camera being transmitted to the router. A management system is associated with 
each display monitor whereby each of the plurality of display monitors is adapted for 
displaying any combination of camera signals independently of the other of said plurality 
of display monitors. 

[00023] The system includes a selector for selecting between the high-resolution
output signal and the low-resolution output signal based on the dimensional size of the
display. The selector may be adapted for manually selecting between the high-resolution
output signal and the low-resolution output signal. Alternatively, a control device may be
employed for automatically selecting between the high-resolution output signal and the
low-resolution output signal based on the size of the display. In one aspect of the
invention, the control device may be adapted to assign a priority to an event captured at a
camera and to select between the high-resolution output signal and the low-resolution
output signal based on the priority of the event.
[00024] It is contemplated that the system will be used with a plurality of cameras
and an encoder associated with each of said cameras. The high-resolution output signal
and low-resolution output signal unique to each camera are then transmitted to a router or
switch, wherein the display monitor is adapted for displaying any combination of camera
signals. In such an application, each displayed signal at a display monitor is selected
between the high-resolution signal and the low-resolution signal of each camera
dependent upon the number of camera signals simultaneously displayed at the display
monitor or upon the control criteria mentioned above.
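
A minimal sketch of such a selector is given below. The pane-count threshold follows the rule the specification gives for 2 x 2 versus 3 x 3 arrays. The option names, the numeric priority threshold of 5, and the precedence of a manual choice over event priority are assumptions made for illustration only.

```javascript
// Hypothetical selector: choose which of a camera's two streams to request.
function selectStream(paneCount, options) {
  const opts = options || {};
  // Assumed precedence: an explicit manual choice wins.
  if (opts.manual === "high" || opts.manual === "low") return opts.manual;
  // Assumed rule: a sufficiently high event priority forces the high-resolution stream.
  if (opts.eventPriority !== undefined && opts.eventPriority >= 5) return "high";
  // Otherwise decide from the layout: 1 or 4 panes use the high-resolution stream.
  return paneCount <= 4 ? "high" : "low";
}
```

With this sketch, selectStream(9) yields "low", while selectStream(16, { eventPriority: 7 }) yields "high" because the event priority overrides the layout rule.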

[00025] The video system of the subject invention is adapted for supporting the use of a
local-area-network (LAN) or wide-area-network (WAN), or a combination thereof, for
distributing digitized camera video on a real-time or "near" real-time basis. 

[00026] In the preferred embodiment of the invention, the system uses a plurality of video
cameras, disposed around a facility to view scenes of interest. Each camera captures the
desired scene, digitizes (and encodes) the resulting video signal, compresses the digitized 
video signal, and sends the resulting compressed digital video stream to a multicast 
address. One or more display stations may thereupon view the captured video via the 
intervening network. 

[00027] Streaming video produced by the various encoders is transported over a generic IP
network to one or more users. User workstations contain one or more ordinary PC's,
each with an associated video monitor. The user interface is provided by an HTML 
application within an industry-standard browser, for example Microsoft Internet 
Explorer. 

[00028] The subject invention comprises an intuitive and user-friendly method for
selecting cameras to view. The main user interface screen provides the user with a map
of the facility, which is overlaid with camera-shaped icons depicting location and 
direction of the various cameras and encoders. This main user interface has, additionally, 
a section of the screen dedicated to displaying video from the selected cameras. 

[00029] The video display area of the main user interface may be arranged to display a
single video image, or may be subdivided by the user into arrays of 4, 9, or 16 smaller
video display areas. 
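
The subdivision into 1, 4, 9, or 16 panes amounts to an n x n grid. The following sketch computes pane rectangles for such a grid; the function name layoutPanes, the row-major pane numbering, and the pixel dimensions used in the example are hypothetical.

```javascript
// Hypothetical layout helper: split a display area into an n x n grid of panes.
function layoutPanes(paneCount, areaWidth, areaHeight) {
  const side = Math.round(Math.sqrt(paneCount));
  if (side * side !== paneCount) {
    throw new Error("pane count must be a perfect square such as 1, 4, 9 or 16");
  }
  const w = Math.floor(areaWidth / side);
  const h = Math.floor(areaHeight / side);
  const panes = [];
  for (let i = 0; i < paneCount; i++) {
    // Panes are numbered left to right, then top to bottom.
    panes.push({ index: i, x: (i % side) * w, y: Math.floor(i / side) * h, width: w, height: h });
  }
  return panes;
}
```

For example, layoutPanes(9, 720, 480) produces nine panes of 240 x 160 pixels, with pane 4 (the center pane) placed at (240, 160).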

[00030] Selection of cameras, and arrangement of the display area, is controlled by a
mouse and conventional Windows user-interface conventions. Users may:

• Select the number of video images to be displayed within the video display 
area. This is done by pointing and clicking on icons representing screens 
with the desired number of images. 
• Display a desired camera within a desired 'pane' in the video display area.
This is done by pointing to the desired area on the map, then 'dragging' the 
camera icon to the desired pane. 

• Edit various operating parameters of the encoders. This is done by pointing 
to the desired camera, then right-clicking the mouse. The user interface then
drops a dynamically generated menu list, which allows the user to adjust the 
desired encoder parameters. 

[00031] One aspect of the invention is the intuitive and user-friendly method for selecting
cameras to view. The breadth of capability of this feature is shown in Fig. 3. The main
user interface screen provides the user with a map of the facility, which is overlaid with 
camera-shaped icons depicting location and direction of the various cameras and 
encoders. This main user interface has, additionally, a section of the screen dedicated to 
displaying video from the selected cameras. 

[00032] The system may employ single or multiple video screen monitor stations.
Single-monitor stations, and the main or primary monitor in multiple-monitor stations, present a
different screen layout than secondary monitors in a multiple-monitor system. The main 
control monitor screen is divided into three functional areas: a map pane, a video display 
pane, and a control pane. The map pane displays one or more maps. Within the map 
pane, a specific site may be selected via mouse-click in a drop-down menu. Within the 
map pane, one or more maps relating to the selected site may be selected via mouse-click 
on a drop-down menu of maps. The sensors may be video cameras and may also include 
other sensors such as motion, heat, fire, acoustic sensors and the like. All user screens 
are implemented as HTML or XML pages generated by a network application server. The 
operating parameters of the camera include still-frame capture versus motion capture,
bit-rate of the captured and compressed motion video, camera name, camera caption, 
camera icon direction in degrees, network address of the various camera encoders, and 
quality of the captured still-frame or motion video. 

[00033] Monitoring stations which employ multiple display monitors use the user interface
screen to control secondary monitor screens. The secondary monitor screens differ from
the primary monitor screen in that they do not possess map panes or control panes but are
used solely for the purpose of displaying one or more video streams from the cameras. In
the preferred embodiment the secondary monitors are not equipped with computer
keyboards or mice. The screen layout and contents of video panes on said secondary
monitors are controlled entirely by the User Interface of the Primary Monitor.
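
One way to picture this arrangement is a single controller that holds the pane assignments of every monitor and edits whichever monitor is currently selected. The sketch below is a hypothetical model only; createStation and its methods do not appear in the specification, and the monitor labels are examples.

```javascript
// Hypothetical model of a station: the primary user interface edits the layout
// of whichever monitor is currently selected, including secondary monitors.
function createStation(monitorIds) {
  const monitors = {};
  monitorIds.forEach(function (id) { monitors[id] = { id: id, panes: [] }; });
  let current = monitors[monitorIds[0]]; // start on the first (primary) monitor
  return {
    selectMonitor: function (id) {
      if (!monitors[id]) throw new Error("unknown monitor: " + id);
      current = monitors[id];
    },
    // Dropping a camera on a pane affects only the selected monitor.
    dropCamera: function (paneIndex, cameraId) { current.panes[paneIndex] = cameraId; },
    // Return a copy so callers cannot modify the stored layout directly.
    panesOf: function (id) { return monitors[id].panes.slice(); }
  };
}
```

Selecting monitor "2" and dropping a camera on pane 0 changes only that monitor's layout; the primary monitor's panes are left as they were.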

The primary monitor display pane contains a control panel comprising a series of
graphical buttons which allow the user to select which monitor he is currently controlling.
When controlling a secondary monitor, the video display region of the primary monitor
represents and displays the screen layout and display pane contents of the selected
secondary monitor.

It is often the case that the user may wish to observe more than 16
cameras, as heretofore discussed. To support this, the system allows the use of additional
PC's and monitors. The additional PC's and monitors operate under the control of the
main user application. These secondary screens do not have the facility map, as does the
main user interface. Instead, these secondary screens use the entire screen area to display
selected camera video. These secondary screens would ordinarily be controlled with their
own keyboard and mouse interface systems. Since it is undesirable to clutter the user's
workspace with multiple input interface systems, these secondary PC's and monitors
operate entirely under the control of the main user interface. To support this, a series of
button icons are displayed on the main user interface, labeled, for example, PRIMARY,
2, 3, and 4. The video display area of the primary monitor then displays the video that
will be displayed on the selected monitor. The primary PC, then, may control the
displays on the secondary monitors. For example, a user may click on the '2' button,
which then causes the primary PC to control monitor number two. When this is done, the
primary PC's video display area also represents what will be displayed on monitor
number two. The user may then select any desired camera from the map, and drag it to a
selected pane in the video display area. When this is done, the selected camera video will
appear in the selected pane on screen number 2.

Streaming video signals tend to be
bandwidth-intensive. Furthermore, since each monitor is capable of displaying up to 16
separate video images, the bandwidth requirements of the system can potentially be
enormous. It is thus desirable to minimize the bandwidth requirements of the system.

To address this, each encoder is equipped with at least two MPEG-1 encoders. When the
encoder is initialized, these two encoders are programmed to encode the same camera
source into two distinct streams: one low-resolution, low-bit rate stream, and one
higher-resolution, higher-bit rate stream. When the user has configured the video display area to
display a single image, that image is obtained from the desired encoder using the
higher-resolution, higher-bit rate stream. The same is true when the user subdivides the video
display area into a 2 x 2 array; the selected images are obtained from the high-resolution,
high-bit rate streams from the selected encoders. The network bandwidth requirements
for the 2 x 2 display array are four times the bandwidth requirements for the single
image, but this is still an acceptably small usage of the network bandwidth.

However, when the user subdivides a video display area into a 3 x 3 array, the demand on network
bandwidth is 9 times higher than in the single-display example. And when the user
subdivides the video display area into a 4 x 4 array, the network bandwidth requirement
is 16 times that of a single display. To prevent network congestion, video images in a
3 x 3 or 4 x 4 array are obtained from the low-resolution, low-bit rate stream of the desired
encoder. Ultimately, no image resolution is lost in these cases, since the actual displayed
video size decreases as the screen is subdivided. That is, if a higher-resolution image
were sent by the encoder, the image would be decimated anyway in order to fit it within
the available screen area.

It is, therefore, an object and feature of the subject invention to
provide the means and method for displaying "live" streaming video over a commercially
available media player system.

It is a further object and feature of the subject invention to
provide the means and method for permitting multiple users to access and view the live
streaming video at different times, while in process, without interrupting the transmission.

[00035] It is a further object and feature of the subject invention to permit conservation of
bandwidth by incorporating a multiple resolution scheme permitting resolution to be
selected dependent upon image size and use of still versus streaming images. 
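
The effect of the multiple resolution scheme on network load can be made concrete with a small calculation. The 128 kb/s figure is the QSIF bitrate quoted for the preferred embodiment; the 1000 kb/s figure for the higher-resolution stream is an assumed value chosen only to make the arithmetic visible, and the function name is hypothetical.

```javascript
// Hypothetical bit rates, chosen only to make the arithmetic concrete.
const HIGH_RES_KBPS = 1000; // assumed rate of the higher-resolution stream
const LOW_RES_KBPS = 128;   // QSIF rate quoted for the preferred embodiment

// Total network load for one monitor showing paneCount cameras.
function monitorBandwidthKbps(paneCount) {
  // 1 or 4 panes use the high-resolution stream; 9 or 16 use the low one.
  const perStream = paneCount <= 4 ? HIGH_RES_KBPS : LOW_RES_KBPS;
  return paneCount * perStream;
}

[1, 4, 9, 16].forEach(function (n) {
  console.log(n + " panes: " + monitorBandwidthKbps(n) + " kb/s");
});
```

Under these assumed figures the loads are 1000, 4000, 1152 and 2048 kb/s respectively, so a 4 x 4 display consumes less bandwidth than a 2 x 2 display rather than four times as much.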

[00036] It is an additional object and feature of the subject invention to provide a
user-friendly screen interface permitting a user to select, control and operate the system from a
single screen display system. 

[00037] It is a further object and feature of the subject invention to permit selective
viewing of a mapped zone from a remote station.

[00038] It is another object and feature of the subject invention to provide for camera
selection and aiming from a remote station.
[00039] Other objects and features of the subject invention will be readily apparent from
the accompanying drawings and detailed description of the preferred embodiment.

Brief Description of the Drawings

[00040] Fig. 1 is a block diagram of a typical multi-camera system in accordance with the
subject invention.

[00041] Fig. 2 is an illustration of the scheme for multicast address resolution. 

[00042] Fig. 3 illustrates a typical screen layout. 

[00043] Fig. 4 is an illustration of the use of the bandwidth conservation scheme of the
subject invention.

[00044] Fig. 5 is an illustration of the user interface for remote control of camera
direction.

[00045] Fig. 6 is an illustration of the user interface for highlighting, activating and
displaying a camera signal.

[00046] Fig. 7 is an illustration of the multiple screen layout and setup. 

[00047] Fig. 8 is an illustration of the dynamic control of screens and displays of various
cameras using the user interface scheme of the subject invention.

Detailed Description of the Preferred Embodiment 

[00048] One aspect of the invention is the intuitive and user-friendly method for selecting
cameras to view. The breadth of capability of this feature is shown in Fig. 3. The main
user interface screen provides the user with a map of the facility, which is overlaid with 
camera-shaped icons depicting location and direction of the various cameras and 
encoders. This main user interface has, additionally, a section of the screen dedicated to 
displaying video from the selected cameras. 

[00049] The video display area of the main user interface may be arranged to display a
single video image, or may be subdivided by the user into arrays of 4, 9, or 16 smaller
video display areas. Selection of cameras, and arrangement of the display area, is 
controlled by the user using a mouse and conventional Windows user-interface 
conventions. Users may: 

• Select the number of video images to be displayed within the video 
display area. This is done by pointing and clicking on icons representing screens with the 
desired number of images. 
• Display a desired camera within a desired 'pane' in the video display area.
This is done by pointing to the desired area on the map, then 'dragging' the camera icon 
to the desired pane. 

• Edit various operating parameters of the encoders. This is done by 
pointing to the desired camera, then right-clicking the mouse. The user interface then
drops a dynamically generated menu list that allows the user to adjust the desired encoder 
parameters. 

[00050] The video surveillance system of the subject invention is specifically adapted for
distributing digitized camera video on a real-time or near real-time basis over a LAN
and/or a WAN. As shown in Fig. 1, the system uses a plurality of video cameras C1,
C2...Cn, disposed around a facility to view scenes of interest. Each camera captures the
desired scene, digitizes the resulting video signal at a dedicated encoder module E1,
E2...En, respectively, compresses the digitized video signal at the respective compressor
P1, P2...Pn, and sends the resulting compressed digital video stream to a multicast
address router R. One or more display stations D1, D2...Dn may thereupon view the
captured video via the intervening network N. The network may be hardwired or
wireless, or a combination, and may be either a Local Area Network (LAN) or a Wide Area
Network (WAN), or both.

[00051] The preferred digital encoders E1, E2...En produce industry-standard MPEG-1
digital video streams. The use of MPEG-1 streams is advantageous due to the low cost of
the encoder hardware, and to the ubiquity of software MPEG-1 players. 

[00052] On desktop computers, it is common practice to play MPEG-1 video and audio
using a proprietary software package such as, by way of example, the Microsoft
Windows Media Player. This software program may be run as a standalone application;
otherwise, components of the player may be embedded within other software applications.

[00053] Any given source of encoded video may be viewed by more than one client. This
could hypothetically be accomplished by sending each recipient a unique copy of the
video stream. However, this approach is tremendously wasteful of network bandwidth.
A superior approach is to transmit one copy of the stream to multiple recipients, via
Multicast Routing. This approach is commonly used on the Internet, and is the subject of
various Internet Standards (RFCs). In essence, a video source sends its video stream to
a Multicast Group Address, which exists as a port on a Multicast-Enabled network router
or switch. The router or switch then forwards the stream only to IP addresses that
have known recipients. Furthermore, if the router or switch can determine that multiple
recipients are located on one specific network path or path segment, the router or switch
sends only one copy of the stream to that path.

From a client's point of view, the client need only connect to a particular 
Multicast Group Address to receive the stream. A range of IP addresses has been 
reserved for this purpose; essentially all IP addresses from 224.0.0.0 to 239.255.255.255 
have been defined as Multicast Group Addresses. 
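
A client can check whether an address falls in this reserved range before attempting to join a group. The sketch below is illustrative; the function name isMulticastGroupAddress is hypothetical and only dotted-quad IPv4 notation is handled.

```javascript
// Returns true when a dotted-quad IPv4 address lies in 224.0.0.0 - 239.255.255.255.
function isMulticastGroupAddress(address) {
  const parts = address.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map(Number);
  // Each part must be one to three digits and within 0..255.
  const valid = octets.every(function (o, idx) {
    return /^\d{1,3}$/.test(parts[idx]) && o >= 0 && o <= 255;
  });
  // The reserved range corresponds to a first octet of 224 through 239.
  return valid && octets[0] >= 224 && octets[0] <= 239;
}
```

An ordinary host address such as 192.168.1.10 fails the check, while 239.1.2.3 passes.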

Unfortunately, there is not currently a standardized mechanism to dynamically 
assign these Multicast Group Addresses, in a way that is known to be globally unique. 
This differs from the ordinary Class A, B, or C IP address classes. In these classes, a 
regulatory agency assigns groups of IP addresses to organizations upon request, and 
guarantees that these addresses are globally unique. Once assigned this group of IP 
addresses, a network administrator may allocate these addresses to individual hosts, 
either statically or dynamically via DHCP or equivalent network protocols. This is not true
of Multicast Group Addresses; they are not assigned by any centralized body and their 
usage is therefore not guaranteed to be globally unique. 

Each encoder must possess two unique IP addresses - the unique Multicast 
Address used by the encoder to transmit the video stream, and the ordinary Class A, B, or 
C address used for more mundane purposes. It is thus necessary to provide a means to 
associate the two addresses, for any given encoder. 

The subject invention includes a mechanism for associating the two addresses. 
This method establishes a sequential transaction between the requesting client and the 
desired encoder. An illustration of this technique is shown in Fig. 2. 

First, the client requesting the video stream identifies the IP address of the desired 
encoder. This is normally done via graphical methods, described more fully below. 
Once the encoder's IP address is known, the client obtains a small file from an associated 
server, using FTP, TFTP or other appropriate file transfer protocol over TCP/IP. The file, 
as received by the requesting client, contains various operating parameters of the encoder 
including frame rate, UDP bit rate, image size, and most importantly, the Multicast
Group Address associated with the encoder's IP address. The client then launches an
instance of Media Player, initializes the previously described front end filter, and directs 
Media Player to receive the desired video stream from the defined Multicast Group 
Address. 
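
The lookup step described above can be sketched as follows. This is a minimal illustration only: the text does not define the layout of the parameter file, so the "key=value" format, the field name `multicastGroup`, and the helper names `parseEncoderParams`, `isMulticastAddress` and `multicastGroupFor` are all assumptions made for the example.

```javascript
// Hypothetical parameter-file format: one "key=value" pair per line.
// The source does not specify the file layout; this is an assumption.
function parseEncoderParams(fileText) {
  const params = {};
  for (const rawLine of fileText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq < 0) continue; // ignore lines without a key/value separator
    params[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return params;
}

// True if the dotted-quad IPv4 address lies in the multicast range,
// i.e. its first octet is between 224 and 239 inclusive.
function isMulticastAddress(address) {
  const parts = String(address).split(".");
  if (parts.length !== 4) return false;
  if (!parts.every((p) => /^\d{1,3}$/.test(p))) return false;
  const octets = parts.map(Number);
  if (octets.some((o) => o > 255)) return false;
  return octets[0] >= 224 && octets[0] <= 239;
}

// Returns the Multicast Group Address the client should receive from,
// or null if the file does not carry a usable one.
function multicastGroupFor(fileText) {
  const group = parseEncoderParams(fileText).multicastGroup; // assumed field name
  return group !== undefined && isMulticastAddress(group) ? group : null;
}
```

In such a sketch the client would fetch the file from the server associated with the encoder's ordinary IP address, pass its text to `multicastGroupFor`, and hand the returned address to the player.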

[00059] Streaming video produced by the various encoders is transported over a generic IP 
network to one or more users. User workstations contain one or more ordinary PC's, 
each with an associated video monitor. The user interface is provided by an HTML 
application within an industry-standard browser, specifically Microsoft Internet Explorer. 

[00060] Some sample source is listed below: 

// this function responds to a dragStart event on a camera
function cameraDragStart(i)
{
    event.dataTransfer.setData("text", currSite.siteMaps[currSite.currMap].hotSpots[i].camera.id);
    dragSpot = currSite.siteMaps[currSite.currMap].hotSpots[i];
    event.dataTransfer.dropEffect = "copy";
    dragging = true;
    event.cancelBubble = true;
}

// this function responds to a dragStart event on a cell
// we might be dragging a hotSpot or a zone
function cellDragStart(i)
{
}

// this function responds to a drop event on a cell input element
function drop(i)
{
    if (dragSpot != null)                   // dragging a hotSpot
    {
    }
    else if (dragZone != null)              // dragging a zone object
    {
        currMonitor.zones[i] = dragZone;    // set the cell zone
        dragZone = null;                    // null dragZone
        zoneVideo(currMonitor.id, i);       // start the video
    }
    else
    {
        dropCameraId(currMonitor, d, i);    // setup hotSpot
        startMonitorVideo(currMonitor, i);  // start the video
        displayCells();                     // redisplay the monitor cells
    }
    dragging = false;
    event.cancelBubble = true;
}

[00061] In the foregoing code, the function: 

event.dataTransfer.setData("text", currSite.siteMaps[currSite.currMap].hotSpots[i].camera.id)

retrieves the IP address of the encoder that the user has clicked and places it in the 
drag data. The subsequent function 
startMonitorVideo(currMonitor, i) passes the IP address of the selected encoder to an 
ActiveX control that then decodes and renders video from the selected source. 

[00062] The system includes a selector for selecting between the high-resolution output 
signal and the low-resolution output signal based on the dimensional size of the display. 
The selector may be adapted for manually selecting between the high-resolution output 
signal and the low-resolution output signal. Alternatively, a control device may be 
employed for automatically selecting between the high-resolution output signal and the 
low-resolution output signal based on the size of the display. In one aspect of the 
invention, the control device may be adapted to assign a priority to an event captured at a 
camera and to select between the high-resolution output signal and the low-resolution 
output signal based on the priority of the event. 
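
One way to picture the selector is the sketch below. The text states only that selection may be manual, based on display size, or based on event priority; the threshold values, the numeric priority scale and the function name `selectStream` are illustrative assumptions.

```javascript
// Illustrative thresholds; the source gives no numeric values.
const MIN_HIGH_RES_WIDTH = 320; // assumed pane width (pixels) that justifies high resolution
const HIGH_PRIORITY = 5;        // assumed priority at or above which high resolution is used

// options.manual:        "high" | "low" | undefined (operator's manual choice)
// options.paneWidth:     displayed width of the video pane, in pixels
// options.eventPriority: priority assigned to an event at the camera (0 or absent = none)
function selectStream(options) {
  if (options.manual === "high" || options.manual === "low") {
    return options.manual; // a manual selection takes precedence
  }
  if ((options.eventPriority || 0) >= HIGH_PRIORITY) {
    return "high"; // a high-priority event is shown in full detail
  }
  // Otherwise choose automatically from the size of the display area.
  return options.paneWidth >= MIN_HIGH_RES_WIDTH ? "high" : "low";
}
```

The ordering of the three checks is a design choice of this sketch rather than something the text prescribes.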

[00063] It is contemplated that the system will be used with a plurality of cameras and an 
encoder associated with each of said cameras. The high-resolution output signal and 
low-resolution output signal unique to each camera are then transmitted to a router or switch, 
wherein the display monitor is adapted for displaying any combination of camera signals. 
In such an application, each displayed signal at a display monitor is selected between the 
high-resolution signal and the low-resolution signal of each camera dependent upon the 
number of camera signals simultaneously displayed at the display monitor or upon the 
control criteria mentioned above. 



[00064] It is often the case that the user may wish to observe more than 16 cameras, as 
heretofore discussed. To support this, the system allows the use of additional PC's and 
monitors. The additional PC's and monitors operate under the control of the main user 
application. These secondary screens do not have the facility map, as does the main user 
interface. Instead, these secondary screens use the entire screen area to display selected 
camera video. 

[00065] These secondary screens would ordinarily be controlled with their own keyboards 
and mice. Since it is undesirable to clutter the user's workspace with multiple mice, 
these secondary PC's and monitors operate entirely under the control of the main user 
interface. To support this, a series of button icons are displayed on the main user 
interface, labeled, for example, PRIMARY, 2, 3, and 4. The video display area of the 
primary monitor then displays the video that will be displayed on the selected monitor. 
The primary PC, then, may control the displays on the secondary monitors. For example, 
a user may click on the '2' button, which then causes the primary PC to control monitor 
number two. When this is done, the primary PC's video display area also represents what 
will be displayed on monitor number two. The user may then select any desired camera 
from the map, and drag it to a selected pane in the video display area. When this is done, 
the selected camera video will appear in the selected pane on screen number 2. 
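
The monitor-selection behavior described above might be modeled as in the following sketch. The data layout and the names `createStation`, `selectMonitor` and `dropCamera` are hypothetical; the sketch tracks only which camera is assigned to which pane of which monitor, and leaves out the network commands that would actually drive a secondary PC.

```javascript
// Hypothetical in-memory model of the monitor buttons (PRIMARY, 2, 3, 4).
function createStation(monitorIds) {
  const layouts = {};
  for (const id of monitorIds) {
    layouts[id] = {}; // pane index -> camera id
  }
  return { selected: monitorIds[0], layouts: layouts };
}

// Called when the user clicks one of the monitor buttons.
function selectMonitor(station, monitorId) {
  if (!Object.prototype.hasOwnProperty.call(station.layouts, monitorId)) {
    throw new Error("unknown monitor: " + monitorId);
  }
  station.selected = monitorId;
}

// Called when a camera icon is dropped on a pane of the primary video
// display area. The assignment is recorded against whichever monitor is
// currently selected, and that monitor's id is returned.
function dropCamera(station, paneIndex, cameraId) {
  station.layouts[station.selected][paneIndex] = cameraId;
  return station.selected;
}
```

For example, after `selectMonitor(station, "2")`, a call to `dropCamera(station, 0, "C3")` records camera C3 against pane 0 of monitor 2 while leaving the PRIMARY layout untouched.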

[00066] Streaming video signals tend to be bandwidth-intensive. The subject invention 
provides a method for maximizing the use of available bandwidth by incorporating 
multiple resolution transmission and display capabilities. Since each monitor is capable 
of displaying up to 16 separate video images, the bandwidth requirements of the system 
can potentially be enormous. It is thus desirable to minimize the bandwidth requirements 
of the system. 

[00067] To address this, each encoder is equipped with at least two MPEG-1 encoders. 
When the encoder is initialized, these two encoders are programmed to encode the same 
camera source into two distinct streams: one low-resolution low-bit rate stream, and one 
higher-resolution, higher-bit rate stream. 

[00068] When the user has configured the video display area to display a single image, 
that image is obtained from the desired encoder using the higher-resolution, higher-bit 
rate stream. The same is true when the user subdivides the video display area into a 2 x 2 
array; the selected images are obtained from the high-resolution, high-bit rate streams 
from the selected encoders. The network bandwidth requirements for the 2x2 display 
array are four times the bandwidth requirements for the single image, but this is still an 
acceptably small usage of the network bandwidth. 

[00069] However, when the user subdivides a video display area into a 3 x 3 array, the 
demand on network bandwidth is 9 times that of the single-display example. And 
when the user subdivides the video display area into a 4 x 4 array, the network bandwidth 
requirement is 16 times that of a single display. To prevent network congestion, video 
images in a 3 x 3 or 4 x 4 array are obtained from the low-resolution, low-speed stream of 
the desired encoder. Ultimately, no image resolution is lost in these cases, since the 
actual displayed video size decreases as the screen is subdivided. If a higher-resolution 
image were sent by the encoder, the image would be decimated anyway in order to fit it 
within the available screen area. 
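
The rule in the preceding paragraphs, and its effect on bandwidth, can be expressed in a few lines. The function names and the bit rates used in the usage note are assumptions for illustration; the text gives no figures for either stream.

```javascript
// Stream used for a display array of gridSide x gridSide panes:
// 1 x 1 and 2 x 2 arrays use the high-resolution stream,
// 3 x 3 and 4 x 4 arrays use the low-resolution stream.
function streamForGrid(gridSide) {
  return gridSide <= 2 ? "high" : "low";
}

// Aggregate bit rate needed to fill every pane of the array, given the
// per-stream bit rates of the two encoder outputs (same units in and out).
function totalBitRate(gridSide, highBitRate, lowBitRate) {
  const panes = gridSide * gridSide;
  const perPane = streamForGrid(gridSide) === "high" ? highBitRate : lowBitRate;
  return panes * perPane;
}
```

With assumed rates of 1,000,000 bit/s for the high-resolution stream and 250,000 bit/s for the low-resolution stream, `totalBitRate(2, 1000000, 250000)` and `totalBitRate(4, 1000000, 250000)` both evaluate to 4,000,000 bit/s, so under those assumed rates a fully populated 4 x 4 array costs no more than a 2 x 2 array.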

[00070] The user interface operations are shown in Figs. 5-8. In general, interface 
functions of the system interact with the system through the browser. Initially, a splash 
screen occurs, containing the login dialog. A check box is provided to enable an 
automatic load of the user's last application settings. After logon, the server loads a 
series of HTML pages, which, with the associated scripts and applets, provide the entire 
user interface. Users equipped with a single-monitor system interact with the system 
entirely through the primary screen. Users may have multiple secondary screens, which 
are controlled by the primary screen. In the preferred embodiment the primary screen is 
divided into three windows: the map window; the video window and the control 
window. 

[00071] The primary screen map window contains a map of the facility and typically is a 
user-supplied series of one or more bitmaps. Each map contains icons representing 
cameras or other sensor sites. Each camera/sensor icon represents the position of the 
camera within the facility. Each site icon represents another facility or function site 
within the facility. In addition, camera icons are styled so as to indicate the direction the 
camera is pointed. When a mouse pointer dwells over a camera icon for a brief, 
predefined interval, a "bubble" appears identifying the camera. Each camera has an 
associated camera ID and camera name. Both of these are unique alphanumeric names of 
20 characters or less and are maintained in a table managed by the server. The camera ID 
is used internally by the system to identify the camera and is not normally seen by the 
user. The camera name is a user-friendly name, assigned by the user and easily 
changeable from the user screen. Any user with administrator privileges may change the 
camera name. 

In the preferred embodiment, the map window is a pre-defined size, typically 510 
pixels by 510 pixels. The bitmap may be scaled to fit, with the camera icons repositioned 
accordingly. 
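
A minimal sketch of the scaling step follows. Preserving the bitmap's aspect ratio is an assumption of this example (the text says only that the bitmap may be scaled to fit), and `fitMapToWindow` and the icon fields `x` and `y` are hypothetical names.

```javascript
const MAP_WINDOW_SIZE = 510; // map window width and height in pixels, per the text

// Scales a bitmap so that it fits inside the square map window and moves
// each camera icon by the same factor so it stays over the same spot.
// icons: array of objects with x and y given in original bitmap pixels.
function fitMapToWindow(bitmapWidth, bitmapHeight, icons) {
  const scale = Math.min(
    MAP_WINDOW_SIZE / bitmapWidth,
    MAP_WINDOW_SIZE / bitmapHeight
  );
  return {
    scale: scale,
    width: Math.round(bitmapWidth * scale),
    height: Math.round(bitmapHeight * scale),
    icons: icons.map(function (icon) {
      // Copy the icon so the caller's objects are not modified.
      return Object.assign({}, icon, {
        x: Math.round(icon.x * scale),
        y: Math.round(icon.y * scale)
      });
    })
  };
}
```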

When the mouse pointer dwells over a camera icon for a brief time, a bubble 
appears which contains the camera name. If the icon is double left clicked, then that 
camera's video appears on the primary screen video window in a full screen view. If the 
icon is right clicked, a menu box appears with further options such as: zone set up; 
camera set up; and event set up. 

When the mouse pointer dwells over a site or sensor icon for a brief time, a bubble 
appears with the site or sensor name. When the icon is double left clicked, the linked site 
is loaded into the primary screen with the previous site retained as a pull down. Finally, 
the user may drag and drop a camera icon into any unused pane in the primary screen 
video window. The drag and drop operation causes the selected camera video to appear 
in the selected pane. The position of the map icon is not affected by the drag and drop 
operation. 

In the preferred embodiment two pull down lists are located beneath the map 
pane. A "site" list contains presets and also keeps track of all of the site maps visited 
during the current session and can act as a navigation list. A "map" list allows the user to 
choose from a list of maps associated with the site selected in the site list. 

The control window is divided into multiple sections, including at least the 
following: a control section including logon, site, presets buttons and a real-time clock 
display; a control screen section for reviewing the image database in either a browse or 
preset mode; and a live view mode. In the live and browse modes events can be 
monitored and identified by various sensors, zones may be browsed, specific cameras 
may be selected and various other features may be monitored and controlled. 



[00077] The primary screen video window is used to display selected cameras from the 
point-click-and-drag feature, the preset system, or the browse feature. This screen and its 
functions also control the secondary monitor screens. The window is selectively a full 
window, split-window or multiple pane windows and likewise can display one, two or 
multiple cameras simultaneously. The user-friendly camera name is displayed along 
with the camera video. The system is set up so that left clicking on the pane will 
"freeze-frame" the video in a particular pane. Right clicking on the pane will initiate various 
functions. Each video pane includes a drag and drop feature permitting the video in a 
pane to be moved to any other pane, as desired. 

[00078] In those monitoring stations having multiple displays, the primary display screen 
described above is also used to control the secondary screens. The secondary screens are 
generally used for viewing selected cameras and are configured by code executing on the 
primary screen. The video pane(s) occupy the entire active video area of the secondary 
screens. 

[00079] The system supports a plurality of cameras and an encoder associated with each 
of the cameras, the high-resolution output signal and low-resolution output signal unique 
to each camera being transmitted to the router. A management system is associated with 
each display monitor whereby each of the plurality of display monitors is adapted for 
displaying any combination of camera signals independently of the other of said plurality 
of display monitors. 

[00080] With specific reference to Fig. 5, the display screen 100 for the primary monitor 
screen is subdivided into three areas or zones, the map zone 102, the video display zone 
104 and the control panel or zone 106. In the illustrated figure, the display zone is 
divided into a split screen 104a and 104b, permitting the video from two cameras to be 
simultaneously displayed. As previously stated, the display zone can be a full screen, 
single camera display, split screen or multiple (window pane) screens for displaying the 
video from a single or multiple cameras. The map zone 102 includes a map of the facility 
with the location and direction of cameras C1, C2, C3 and C4 displayed as icons on the 
map. The specific cameras displayed at the display screen are shown in the display 
window, here cameras C1 and C3. If different cameras are desired, the user simply 
places the mouse pointer on a camera in the map, clicks and drags the camera to a screen 
and it will replace the currently displayed camera, or the screen may be reconfigured to 
include empty panes. 

[00081] The control panel 106 has various functions as previously described. As shown in 
Fig. 5, the control panel displays the camera angle feature. In this operation, the desired 
camera (C1, C2, C3 or C4) is selected and the camera direction (or angle) will be 
displayed. The user then simply changes the angle as desired to select the new camera 
direction. The new camera direction will be maintained until again reset by the user, or 
may return to a default setting when the user logs off, as desired. 

[00082] Fig. 7 illustrates the primary screen 100 with the map zone 102 and with the 
viewing zone 104 now reconfigured into a four pane display 104a, 104b, 104c, 104d. 
The control panel 106 is configured to list all of the cameras (here cameras C1, C2 and 
C3). The user may either point and click on a camera in the map and the camera will be 
highlighted on the list, or vice versa, the user may highlight a camera on the list and it 
will flash on the map. The desired camera may then be displayed in the viewing 
windows by the previously described drag-and-click method. 

[00083] Fig. 7 shows a primary monitor 100 in combination with one or more secondary 
monitors 108 and 110. The primary monitor includes the map zone 102, the display zone 
104 and the control panel 106 as previously described. As shown in a partial enlarged 
view, the control panel will include control "buttons" 112 for selecting the various 
primary "P" and numbered secondary monitors. Once a monitor is selected, the display 
configuration may then be selected ranging from full screen to multiple panes. Thus each 
monitor can be used to display different configurations of cameras. For example, in 
practice it is desirable that the primary monitor is used for browsing, while one secondary 
monitor is a full screen view of a selected camera and a second secondary monitor is 
divided into sufficient panes to display all cameras on the map. This is further 
demonstrated in Fig. 8. 

[00084] The system of the present invention greatly enhances the surveillance capability 
of the user. The map not only permits the user to determine what camera he is looking at 
but also the specific direction of the camera. This can be done by inputting the angular 
direction of the camera, as indicated in Fig. 5, or by rotating the camera icon with the 
mouse, or by using an automatic panning head on the camera. When using the panning 
head, the head is first calibrated to the map by inputting a reference direction in degrees 
and by using the mouse on the map to indicate a defined radial using the camera as the 
center point. 
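
The calibration step can be illustrated with the sketch below, which is one possible reading of the text rather than the method itself. It assumes that directions are measured in degrees clockwise from the top of the map, that screen y coordinates increase downward, and that the reference direction entered by the user is the panning head's own reading when the camera points along the radial indicated with the mouse. All function names are hypothetical.

```javascript
// Wraps any angle into the range [0, 360).
function normalizeDegrees(angle) {
  return ((angle % 360) + 360) % 360;
}

// Direction from the camera icon to a point on the map, in degrees
// clockwise from "up" (assumed convention; screen y grows downward).
function mapBearing(camera, point) {
  const dx = point.x - camera.x;
  const dy = camera.y - point.y; // flip sign so that "up" is positive
  return normalizeDegrees(Math.atan2(dx, dy) * 180 / Math.PI);
}

// Offset between the head's own angle scale and the map, derived from
// the clicked radial and the reference direction the user typed in.
function calibrate(camera, clickedPoint, referenceDegrees) {
  return normalizeDegrees(mapBearing(camera, clickedPoint) - referenceDegrees);
}

// Converts a later head reading into the direction to draw on the map.
function headToMapDirection(headDegrees, offset) {
  return normalizeDegrees(headDegrees + offset);
}
```

Once the offset is known, each new pan position reported by the head can be passed through `headToMapDirection` to orient the camera icon.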

[00085] The camera icon on the map can be used to confirm that a specific camera has 
been selected by hovering over a pane in the selected screen (whole, split or multiple), 
whereby the displayed video will be tied to a highlighted camera on the map. The mouse 
pointer can also be used to identify a camera by pointing to a camera on the sensor list, 
also causing the selected camera to be highlighted on the map zone. When automatic 
event detection is utilized, an event detection sensor will cause a camera to be activated; 
it will then be highlighted on the map and displayed on the video display zone. Event 
detection can include any of a number of event sensors ranging from panic buttons to fire 
detection to motion detection and the like. Where desired, different highlighting colors 
may be used to identify the specific event causing the camera activation. 
[00086] The screen configuration may be selected manually or automatically. For example, a 
number of cameras may be selected and the screen configuration may be set to display 
the selected number of cameras in the most efficient configuration. This can be 
accomplished by clicking on the camera icons on the map, selecting the cameras from the 
sensor list, or typing in the selected cameras. In the most desired configuration, an event 
detection will automatically change the display configuration of the primary screen to 
immediately display the video from a camera experiencing an event phenomenon. 
Cameras may also be programmed to be displayed on a cyclical, time-sequenced basis or under 
other pre-programmed conditions, including panning, by way of example. 
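
As an illustration of the automatic case, the sketch below picks the smallest square array that holds a given number of selected cameras. Treating the smallest sufficient array as the "most efficient configuration" is an interpretation made for this example, and `gridSideFor` is a hypothetical name.

```javascript
// Array sizes available in the video display area: 1, 4, 9 or 16 panes.
const GRID_SIDES = [1, 2, 3, 4];

// Returns the side length of the smallest square array with at least
// cameraCount panes, or null if more than 16 cameras are selected
// (in which case additional monitors would be needed).
function gridSideFor(cameraCount) {
  if (!Number.isInteger(cameraCount) || cameraCount < 1) {
    throw new RangeError("cameraCount must be a positive integer");
  }
  for (const side of GRID_SIDES) {
    if (side * side >= cameraCount) {
      return side;
    }
  }
  return null;
}
```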
[00087] Specifically, the screen configuration is dynamic and can be manually changed or 
changed automatically in response to the detection of events and conditions or through 
programming. 

[00088] One aspect of the invention is the intuitive and user-friendly method for selecting 
cameras to view. The breadth of capability of this feature is shown in Fig. 3. The main 
user interface screen provides the user with a map of the facility, which is overlaid with 
camera-shaped icons depicting location and direction of the various cameras and 
encoders. This main user interface has, additionally, a section of the screen dedicated to 
displaying video from the selected cameras. 
[00090] The video display area of the main user interface may be arranged to display a 
single video image, or may be subdivided by the user into arrays of 4, 9, or 16 smaller 
video display areas. Selection of cameras, and arrangement of the display area, is 
controlled by the user using a mouse and conventional Windows user-interface 
conventions. Users may: 

• Select the number of video images to be displayed within the video 
display area. This is done by pointing and clicking on icons representing screens with the 
desired number of images. 

• Display a desired camera within a desired 'pane' in the video display area. 
This is done by pointing to the desired area on the map, then 'dragging' the camera icon 
to the desired pane. 

• Edit various operating parameters of the encoders. This is done by 
pointing to the desired camera, then right-clicking the mouse. The user interface then 
drops a dynamically generated menu list that allows the user to adjust the desired encoder 
parameters. 

[00091] While specific features and embodiments of the invention have been described in 
detail herein, it will be understood that the invention includes all of the enhancements and 
modifications within the scope and spirit of the following claims. 